Gyeonggi-do
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > Canada (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- North America > Canada (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > Canada (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- North America > Canada (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models
Yamabe, Shojiro, Waseda, Futa, Shiono, Daiki, Takahashi, Tsubasa
Recent large vision-language models (LVLMs) have been applied to diverse VQA tasks. However, achieving practical performance typically requires task-specific fine-tuning with large numbers of image-text pairs, which are costly to collect. In this work, we study text-centric training, a setting where only textual descriptions are available and no real images are provided, as a paradigm for low-cost data scaling. Unlike images, whose collection is often restricted by privacy constraints and scarcity in niche domains, text is widely available. Moreover, text is easily editable, enabling automatic diversification and expansion with LLMs at minimal human effort. While this offers clear advantages over image collection in terms of scalability and cost, training on raw text without images still yields limited gains on VQA tasks because of the image-text modality gap. To address this issue, we propose a Text-Printed Image (TPI), which generates synthetic images by directly rendering the given textual description on a plain white canvas. This simple rendering projects text into the image modality and can be integrated into arbitrary existing LVLM training pipelines at low cost. Moreover, TPI preserves the semantics of the text, whereas text-to-image models often fail to do. Across four models and seven benchmarks, our systematic experiments show that TPI enables more effective text-centric training than synthetic images generated by a diffusion model. We further explore TPI as a low-cost data-augmentation strategy and demonstrate its practical utility. Overall, our findings highlight the significant potential of text-centric training and, more broadly, chart a path toward fully automated data generation for LVLMs.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)
- Information Technology (0.46)
- Health & Medicine (0.46)
A CNN-Based Technique to Assist Layout-to-Generator Conversion for Analog Circuits
Jeong, Sungyu, Kim, Minsu, Kim, Byungsub
We propose a technique to assist in converting a reference layout of an analog circuit into the procedural layout generator by efficiently reusing available generators for sub-cell creation. The proposed convolutional neural network (CNN) model automatically detects sub-cells that can be generated by available generator scripts in the library, and suggests using them in the hierarchically correct places of the generator software. In experiments, the CNN model examined sub-cells of a high-speed wireline receiver that has a total of 4,885 sub-cell instances including different 145 sub-cell designs. The CNN model classified the sub-cell instances into 51 generatable and one not-generatable classes. One not-generatable class indicates that no available generator can generate the classified sub-cell. The CNN model achieved 99.3% precision in examining the 145 different sub-cell designs. The CNN model greatly reduced the examination time to 18 seconds from 88 minutes required in manual examination. Also, the proposed CNN model could correctly classify unfamiliar sub-cells that are very different from the training dataset.
- Asia > South Korea > Gyeongsangbuk-do > Pohang (0.05)
- North America > United States > Oregon > Washington County > Hillsboro (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Workflow (0.94)
- Research Report (0.64)
Probabilistic Wildfire Spread Prediction Using an Autoregressive Conditional Generative Adversarial Network
Climate change has intensified the frequency and severity of wildfires, making rapid and accurate prediction of fire spread essential for effective mitigation and response. Physics-based simulators such as FARSITE offer high-fidelity predictions but are computationally intensive, limiting their applicability in real-time decision-making, while existing deep learning models often yield overly smooth predictions that fail to capture the complex, nonlinear dynamics of wildfire propagation. This study proposes an autoregressive conditional generative adversarial network (CGAN) for probabilistic wildfire spread prediction. By formulating the prediction task as an autoregressive problem, the model learns sequential state transitions, ensuring long-term prediction stability. Experimental results demonstrate that the proposed CGAN-based model outperforms conventional deep learning models in both overall predictive accuracy and boundary delineation of fire perimeters. These results demonstrate that adversarial learning allows the model to capture the strong nonlinearity and uncertainty of wildfire spread, instead of simply fitting the pixel average. Furthermore, the autoregressive framework facilitates systematic temporal forecasting of wildfire evolution. The proposed CGAN-based autoregressive framework enhances both the accuracy and physical interpretability of wildfire spread prediction, offering a promising foundation for time-sensitive response and evacuation planning.
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- South America > Brazil (0.04)
- (6 more...)